Three steps aid in the analysis of selection. First, describe phenotypes by their component causes. 
Components include genes, maternal effects, symbionts, and any other predictors of phenotype 
that are of interest. Second, describe fitness by its component causes, such as an individual's 
phenotype, its neighbors' phenotypes, resource availability, and so on. Third, put the predictors 
of phenotype and fitness into an exact equation for evolutionary change, providing a complete 
expression of selection and other evolutionary processes. The complete expression separates the 
distinct causal roles of the various hypothesized components of phenotypes and fitness. Traditionally, 
those components are given by the covariance, variance, and regression terms of evolutionary models. 
I show how to interpret those statistical expressions with respect to information theory. The resulting 
interpretation allows one to read the fundamental equations of selection and evolution as sentences 
that express how various causes lead to the accumulation of information by selection and the decay 
of information by other evolutionary processes. The interpretation in terms of information leads to 
a deeper understanding of selection and heritability, and a clearer sense of how to formulate causal 
hypotheses about evolutionary process. Kin selection appears as a particular type of causal analysis 
that partitions social effects into meaningful components.



The path method ... is not so much concerned
with prediction as [it is with] the proposal
of a plausible interpretation of the relationships
between the variables. In other
words, path analysis is concerned with erecting
a causal structure compatible with the observed
data [1, p. 3].



INTRODUCTION 

Populations accumulate information by natural selection.
The amount of information may be expressed by
classical information theory [2]. That purely informational
expression describes phenotypes and fitness abstractly,
without consideration of the explicit causes that
determine phenotypic traits and their association with
fitness. Here, I partition phenotypes and fitness into their
component causes.

For phenotypes, we must track the influence of genes, 
symbionts, maternal effects and other potential causes. 
The components of phenotype lead to explicit models 
of character expression and heritability. For fitness, we 
must track how different characters and external forces 
combine to determine success. An individual's fitness 
may, for example, depend on a combination of its own 
phenotype and the phenotypes of its neighbors. 

I put those explicit causal components of phenotype 
and fitness into the fundamental expressions of selection 
and evolutionary change. I recover an expanded concept 
of heritability, a precise understanding of Fisher's fundamental
theorem, and a general form of the equations of
selection for multiple characters. With those tools, the 
following article clarifies kin selection and other social 
processes [3]. 

I presented much of this material in Frank [4, 5]. Here,
I pursue four goals. First, I express the key partitions
of phenotypes and fitness with respect to my new information
theory interpretation of selection [2]. Second, the
information expressions translate the traditional regression
and variance terms of selection into more meaningful
descriptions of cause and consequence. Third, the
partitions of phenotype and fitness provide the basis for
replacing outdated concepts of kin selection with a solid
conceptual foundation [3]. Fourth, I emphasize simplicity,
presenting the mathematical material at the most
basic level consistent with the concepts. The original
publications contain more detail [4, 5].

Mathematically, little is required beyond simple forms 
of statistical regression and the location of points in coor- 
dinate systems. Although I use only basic mathematics, 
the article is nonetheless challenging. I cover a wide ar- 
ray of problems at a very general level, with emphasis 
on the connections between seemingly different topics. 
That sustained abstraction and synthesis provide both 
significant rewards and demanding challenges. 

It may seem that the basic problems of selection and 
kin interactions were solved long ago. Why do we need 
to revisit those topics? In fact, our understanding of 
natural selection and kin selection has continued to ad- 
vance over the past few decades. Those advances have 
developed while the old formulations have remained. The 
core of the subject has become cluttered with incompat- 
ible expressions from different eras, derived in different 
contexts. One can no longer go forward without first 
resetting the foundations. 
Box 1. Topics in the theory of natural selection 



This article is part of a series on natural selection. Although 
the theory of natural selection is simple, it remains endlessly 
contentious and difficult to apply. My goal is to make more ac- 
cessible the concepts that are so important, yet either mostly 
unknown or widely misunderstood. I write in a nontechni- 
cal style, showing the key equations and results rather than 
providing full derivations or discussions of mathematical prob- 
lems. Boxes list technical issues and brief summaries of the 
literature. 



SELECTION 

I briefly review the general equations for selection and 
evolution. Recent articles in this series provide full de- 
tails HE]. 

The Price equation 

Consider an initial population. Let $\bar{z}$ be the average
in the population of some value (phenotype). A second
population has average value $\bar{z}'$. Total change between
the populations is $\Delta\bar{z} = \bar{z}' - \bar{z}$. Split the total change
into two components

$$\Delta\bar{z} = \Delta_s\bar{z} + \Delta_c\bar{z}. \tag{1}$$

The first term, $\Delta_s$, is the part of the total change caused
by selection. The second term, $\Delta_c$, is the remaining part
of total change by all other causes.

To evaluate these terms, we write the average value as
$\bar{z} = \sum q_i z_i$. The index $i$ divides the population in any
way that we choose. We may use $i$ to label by different
individuals, by different groups, by genotype, or by any
other partition of the population. The frequency of a
type $i$ in the population is $q_i$. The phenotype associated
with $i$ is $z_i$. The average value in the second population
is $\bar{z}' = \sum q_i' z_i'$.

We define selection as changes in frequency, holding
constant phenotype

$$\Delta_s\bar{z} = \sum q_i' z_i - \sum q_i z_i.$$

Here, the populations differ in their frequencies, $\Delta q_i = q_i' - q_i$,
but we have held the phenotype values constant at
$z_i$ in both populations. Using $\Delta q_i$ for frequency change,
we write

$$\Delta_s\bar{z} = \sum \Delta q_i z_i. \tag{2}$$

To obtain the total change, we need the changes in
phenotype holding constant the frequencies

$$\Delta_c\bar{z} = \sum q_i' z_i' - \sum q_i' z_i.$$



Box 2. Price equation: difference of a product 



The Price equation simply expands a difference into multiple
terms. Consider, for example, the difference of the product
of x and y, which we write as $\Delta(xy) = x'y' - xy$. We can
expand the difference of the product as

$$\Delta(xy) = (x + \Delta x)(y + \Delta y) - xy,$$

which yields

$$\Delta(xy) = (\Delta x)y + x(\Delta y) + \Delta x\Delta y.$$

This expression shows that the difference of a product is the
difference of the first term holding the second term constant,
plus the difference of the second term holding the first term
constant, plus the product of the two differences.

We can simplify the difference expansion by combining a
pair of terms on the right-hand side. Noting that $x' = x + \Delta x$,
we can combine the last two terms into one, yielding

$$\Delta(xy) = (\Delta x)y + x'(\Delta y).$$

The derivation of the Price equation follows the rule for the
difference of a product

$$\Delta\bar{z} = \Delta\sum q_i z_i = \sum(\Delta q_i)z_i + \sum q_i'(\Delta z_i).$$

The value of the Price equation arises from identifying
$\sum(\Delta q_i)z_i$ as the part of total change caused by selection.
Selection acts on phenotype at a fixed point in time, so it
makes sense to consider selection as the partial difference in
frequency holding phenotype constant. When we use log fitness
for the phenotype, $m = z$, we get an exact correspondence
between the selection term and the increase in information
expressed by classical information theory (eqn 8). That
correspondence supports interpreting $\sum(\Delta q_i)z_i$ as selection.



Here, the populations differ in their phenotype, $\Delta z_i = z_i' - z_i$,
but we have fixed the frequency at $q_i'$. We use
the final frequencies in the second population, $q'$, because
they provide the proper reference for final phenotype after
change (Box 2). Using $\Delta z_i$ for phenotypic changes,
we write

$$\Delta_c\bar{z} = \sum q_i'\Delta z_i.$$

The total change from eqn 1 can now be written (Box 2)
as a form of the Price equation

$$\Delta\bar{z} = \sum \Delta q_i z_i + \sum q_i'\Delta z_i. \tag{3}$$
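The partition in eqn 3 can be checked numerically. The following Python sketch assumes a hypothetical population of three types; the frequencies and phenotype values are invented for illustration, and the variable names are not part of the original analysis.

```python
# Hypothetical three-type population; all numbers are invented for illustration.
q_old = [0.5, 0.3, 0.2]     # initial frequencies, q_i
q_new = [0.4, 0.35, 0.25]   # frequencies in the second population, q_i'
z_old = [1.0, 2.0, 4.0]     # initial phenotypes, z_i
z_new = [1.1, 2.0, 3.8]     # phenotypes in the second population, z_i'


def average(freqs, values):
    """Frequency-weighted average, the sum of q_i * z_i."""
    return sum(f * v for f, v in zip(freqs, values))


# Total change in the average value between the two populations.
total_change = average(q_new, z_new) - average(q_old, z_old)

# Selection: change in frequency, holding phenotype constant at z_i (eqn 2).
selection_part = sum((qn - qo) * z for qo, qn, z in zip(q_old, q_new, z_old))

# Other causes: change in phenotype, weighted by the final frequencies q_i'.
other_part = sum(qn * (zn - zo) for qn, zo, zn in zip(q_new, z_old, z_new))

print("total change:     ", total_change)
print("selection + other:", selection_part + other_part)
```

The two printed values agree up to floating-point rounding, because eqn 3 is an identity rather than an approximation.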
Classical expressions of covariance, regression and variance



The definition of fitness is

$$q_i' = q_i\frac{w_i}{\bar{w}}, \tag{4}$$

where $w_i$ is the fitness of type $i$, and $\bar{w}$ is average fitness.
The change in frequency is

$$\Delta q_i = q_i\left(\frac{w_i}{\bar{w}} - 1\right).$$

Thus, the change caused by selection can be written as a
covariance between fitness and phenotype

$$\sum \Delta q_i z_i = \sum q_i\left(\frac{w_i}{\bar{w}} - 1\right)z_i = \mathrm{Cov}(w, z)/\bar{w}. \tag{5}$$

We can rewrite a covariance as a product of a regression
coefficient and a variance term

$$\Delta_s\bar{z} = \mathrm{Cov}(w, z)/\bar{w} = \beta_{zw}V_w/\bar{w}, \tag{6}$$

where $\beta_{zw}$ is the regression of phenotype, z, on fitness,
w, and $V_w$ is the variance in fitness. Selection equations
are often expressed with these covariance, regression and
variance terms. Classical population genetics expressions
for change in gene frequency also have this form, in which
we let $z = p$ be the frequency of a gene in a population.
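A short numerical sketch may help connect eqns 4 to 6. It assumes hypothetical frequencies, fitnesses, and phenotypes for three types, and computes the selection term three ways: directly from the frequency changes, from the covariance, and from the regression coefficient multiplied by the variance.

```python
# Hypothetical values for three types; not taken from the original analysis.
q = [0.5, 0.3, 0.2]   # initial frequencies, q_i
w = [1.6, 2.2, 2.7]   # fitnesses, w_i
z = [1.0, 2.0, 4.0]   # phenotypes, z_i


def mean(values):
    """Average weighted by the initial frequencies q_i."""
    return sum(qi * v for qi, v in zip(q, values))


def cov(a, b):
    """Covariance weighted by the initial frequencies q_i."""
    a_bar, b_bar = mean(a), mean(b)
    return sum(qi * (ai - a_bar) * (bi - b_bar) for qi, ai, bi in zip(q, a, b))


w_bar = mean(w)
q_new = [qi * wi / w_bar for qi, wi in zip(q, w)]   # eqn 4

# Selection term computed directly from the frequency changes (eqn 2).
direct = sum((qn - qi) * zi for qi, qn, zi in zip(q, q_new, z))

# Covariance form (eqn 5) and regression-times-variance form (eqn 6).
var_w = cov(w, w)
beta_zw = cov(z, w) / var_w
cov_form = cov(w, z) / w_bar
regression_form = beta_zw * var_w / w_bar

print(direct, cov_form, regression_form)
```

All three values coincide up to rounding. The regression form adds no new information beyond the covariance; it only factors the covariance into a slope and a variance.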



INFORMATION 

Frank [2] showed that selection can be expressed in 
terms of information theory. I briefly review the key 
points in this section. 



Fitness and the gain in encoded information 

Fitness, w, describes relative changes in frequency.
Logarithms provide the natural scaling for relative
changes. Using the expression for fitness in eqn 4, we
write log fitness as

$$m_i = \log(w_i) = \log(\bar{w}) + \log\left(\frac{q_i'}{q_i}\right).$$

Using $z = m$ in the expression for selection (eqn 2), we
have

$$\Delta_s\bar{m} = \sum \Delta q_i m_i = \sum \Delta q_i\log\left(\frac{q_i'}{q_i}\right),$$

in which the $\log(\bar{w})$ term drops out because the frequency
changes sum to zero, $\sum \Delta q_i = 0$.
The classic information theory expression for the change
in encoded information between two populations with
frequencies $q'$ and $q$ is

$$\mathcal{D}(q'\|q) = \sum q_i'\log\left(\frac{q_i'}{q_i}\right). \tag{7}$$

With that definition, we have

$$\Delta_s\bar{m} = \mathcal{D}(q'\|q) + \mathcal{D}(q\|q'),$$

in which the right hand side is known as the Jeffreys
information divergence, $J$. Thus, we can write the
fundamental expression for the accumulation of information
by natural selection as

$$\Delta_s\bar{m} = J. \tag{8}$$
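As a check on eqn 8, the following Python sketch compares the selection term for log fitness with the sum of the two divergences from eqn 7. The frequencies and fitnesses are hypothetical, and the helper name `divergence` is introduced here only for illustration.

```python
import math

# Hypothetical values for three types; not taken from the original analysis.
q = [0.5, 0.3, 0.2]   # initial frequencies, q_i
w = [1.6, 2.2, 2.7]   # fitnesses, w_i

w_bar = sum(qi * wi for qi, wi in zip(q, w))
q_new = [qi * wi / w_bar for qi, wi in zip(q, w)]   # eqn 4
m = [math.log(wi) for wi in w]                      # log fitness, m_i


def divergence(a, b):
    """Expression in eqn 7: the sum of a_i * log(a_i / b_i)."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))


# Selection term with log fitness as the character (eqn 2 with z = m).
delta_s_m = sum((qn - qi) * mi for qi, qn, mi in zip(q, q_new, m))

# Jeffreys divergence: the sum of the divergences taken in both directions.
jeffreys = divergence(q_new, q) + divergence(q, q_new)

print(delta_s_m, jeffreys)
```

The two values match up to rounding for any positive fitnesses, since the constant log of average fitness cancels when weighted by frequency changes that sum to zero.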



Because z in eqn 6 is just a placeholder for any character,
we can use m in place of z in that equation, yielding

$$\Delta_s\bar{m} = \beta_{mw}V_w/\bar{w}.$$

Thus, the information accumulated by natural selection,
J, is equivalently expressed in terms of the regression
coefficient and the variance,

$$J = \beta_{mw}V_w/\bar{w}. \tag{9}$$



Variance, regression and information



The variance in fitness, $V_w$, is proportional to the
information gain by natural selection, J (eqn 9). It is easy
to understand why selection may be expressed in terms
of information. Selection is, in essence, a process by
which populations gain information about the environment.
But why should the variance arise as an alternative
description of selection?

The usual view is that selection acts on differences
within the population. The greater the differences, the
larger the variance and the greater the opportunity for
selection. But why exactly is the variance the correct
measure of differences within the population, rather than
some other measure of variation?

Consider the definition of fitness (eqn 4) given earlier

$$\frac{q_i'}{q_i} = \frac{w_i}{\bar{w}},$$

in which the relative fitness is the ratio of frequencies
between the new and old population. Relative fitness
is, in essence, a measure of the separation between the
new population and the old population, a comparison of
$q'$ versus $q$. Because the frequencies in each population
must add to one, each separation between a pair $q_i'$ and $q_i$
must be balanced by opposite separations in other pairs.

Thus, the variation in the $q_i'/q_i$ ratios measures the
total separation of the new population from the old
population. In particular, the variance in those ratios (the
variance in fitness) is like a distance between the new
population and the old population. That distance-like
measure has units in terms of the information gain [2].
The variance in fitness expresses an informational
distance, the amount of information gained by selection.
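The link between the variance in fitness and the information gain can be illustrated numerically. This sketch assumes the same kind of hypothetical three-type population as before. It checks that the variance of the frequency ratios equals the variance in fitness divided by the square of average fitness, and that eqn 9 recovers the Jeffreys divergence.

```python
import math

# Hypothetical values for three types; not taken from the original analysis.
q = [0.5, 0.3, 0.2]   # initial frequencies, q_i
w = [1.6, 2.2, 2.7]   # fitnesses, w_i


def mean(values):
    """Average weighted by the initial frequencies q_i."""
    return sum(qi * v for qi, v in zip(q, values))


def cov(a, b):
    """Covariance weighted by the initial frequencies q_i."""
    a_bar, b_bar = mean(a), mean(b)
    return sum(qi * (ai - a_bar) * (bi - b_bar) for qi, ai, bi in zip(q, a, b))


w_bar = mean(w)
q_new = [qi * wi / w_bar for qi, wi in zip(q, w)]   # eqn 4
ratios = [qn / qi for qi, qn in zip(q, q_new)]      # q_i' / q_i

# Variance of the frequency ratios versus the variance in fitness.
var_ratios = cov(ratios, ratios)
var_w = cov(w, w)

# Information gain from the frequencies, and from regression and variance.
m = [math.log(wi) for wi in w]
beta_mw = cov(m, w) / var_w
jeffreys = sum((qn - qi) * math.log(qn / qi) for qi, qn in zip(q, q_new))
regression_form = beta_mw * var_w / w_bar           # eqn 9

print(var_ratios, var_w / w_bar ** 2)
print(jeffreys, regression_form)
```

Each printed pair agrees up to rounding. The regression coefficient of log fitness on fitness is what converts the variance, measured on the linear scale, into the information gain, measured on the logarithmic scale.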

Information gain is measured on the logarithmic scale 
of frequency changes (eqn 7). The regression coefficient, 
Box 3. Regression 



Simple regression is based on the equation for a line

$$z = \alpha + \beta y,$$

in which z is the outcome of interest, y is a variable that is
used to predict z, the term $\beta$ is the slope of the line relating
z to y, and $\alpha$ is the intercept, which is the value of z when
$y = 0$. The simple regression model is usually written as

$$z_i = \alpha + \beta_{zy}y_i + \delta_i,$$

in which the i subscripts denote values associated with different
observations, and $\delta_i$ is the residual as described below.
In some applications, it is convenient to make the intercept $\alpha$
disappear, which we achieve by $y_i = x_i - \alpha/\beta_{zy}$, which gives

$$z_i = \beta_{zx}x_i + \delta_i.$$

This expression is equivalent to the previous one. The only
change is that x differs from y by a constant value. The
second expression uses $\beta_{zx}$ in place of $\beta_{zy}$. Those terms have
the same value, but I use the term with x to emphasize that
the relation is now between z and x. In any regression model,
we can make a similar substitution in which we shift y by
a constant value to get an x value that makes the intercept
disappear.

From the perspective of regression analysis, $\beta_{zx}x$ provides
a prediction of z given x. The difference between the actual
value and the predicted value is the residual (error),
$\delta_i = z_i - \beta_{zx}x_i$. Two changes in notation provide a cleaner expression.
Write the regression coefficient as $b = \beta_{zx}$, and drop the i
subscript, yielding

$$z = bx + \delta,$$

where the variables implicitly range over i.

Regression has a natural asymmetry. In prediction, the
value of z is the predicted value given the predictor, x. In a
causal interpretation, in the sense of path analysis (Box 5), the
effect z depends on the cause, x. One must keep this asymmetry
in mind to interpret regression equations correctly. Proper
notation helps. We may write

$$z|x = bx + \delta,$$

which emphasizes that the outcome, z, depends on the given
fixed value of x. We read $z|x$ as "z given x." If we take the
average of both sides, we obtain

$$E(z|x) = bx,$$

where $E(z|x)$ is the expectation of z given x, in which
"expectation" means the average value. On the right side, $\delta$
disappears because the regression coefficient, $b$, is chosen so
that the average value of the residual is zero, $\bar{\delta} = 0$.
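The substitution that removes the intercept can be illustrated with a small least squares fit. The Python sketch below uses invented data and assumes equal weight for each observation. It fits the slope and intercept, shifts the predictor by the constant $\alpha/\beta_{zy}$, and confirms that the residuals average to zero and are uncorrelated with the predictor.

```python
# Hypothetical observations; the values are invented for illustration.
y = [1.0, 2.0, 3.0, 4.0, 5.0]     # original predictor, y_i
z = [2.1, 3.9, 6.2, 7.8, 10.0]    # outcome, z_i

n = len(y)
y_bar = sum(y) / n
z_bar = sum(z) / n

# Least squares slope and intercept for z = alpha + beta * y + delta.
beta = (sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z))
        / sum((yi - y_bar) ** 2 for yi in y))
alpha = z_bar - beta * y_bar

# Shift the predictor so that y = x - alpha / beta; the intercept disappears.
x = [yi + alpha / beta for yi in y]

# Residuals from the intercept-free form, z = beta * x + delta.
delta = [zi - beta * xi for zi, xi in zip(z, x)]

x_bar = sum(x) / n
delta_bar = sum(delta) / n
cov_x_delta = sum((xi - x_bar) * (di - delta_bar) for xi, di in zip(x, delta)) / n

print("slope:", beta)
print("mean residual:", delta_bar)
print("Cov(x, delta):", cov_x_delta)
```

The slope is unchanged by the shift, and both the mean residual and its covariance with the predictor are zero up to rounding.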



$\beta_{mw}$, transforms fitness from the linear scale, w, to the
log scale, m, yielding the key expression given earlier for
the change in log fitness (information) caused by selection

$$\Delta_s\bar{m} = J = \beta_{mw}V_w/\bar{w}.$$

It is common to think of a regression coefficient as a linear
prediction estimated from data. That interpretation
misleads with regard to understanding the fundamental
equations of selection. Instead, the regression coefficient
describes the consequence for the change in average value
when transforming from one scale to another scale (Boxes
3 and 4). The proper way to read $\beta_{mw}$ is a change in scale
from w to m when evaluating the averages $\bar{w}$ and $\bar{m}$.



Phenotype as a change in the scaling of information 

Selection causes populations to accumulate informa- 
tion. The measure of information is related to log fitness. 
In the analysis of selection, we often focus on phenotypes 
rather than fitness. Here, I show that, with respect to 
selection, one can think of the phenotypic scale simply as 
an alternative scale on which to measure information. 

Begin with the expression given earlier for the change
in log fitness

$$\Delta_s\bar{m} = \beta_{mw}V_w/\bar{w}.$$

The regression coefficient, $\beta_{mw}$, changes scale from fitness,
w, to log fitness, m. If we divide by $\beta_{mw}$, we obtain

$$\frac{\Delta_s\bar{m}}{\beta_{mw}} = V_w/\bar{w}.$$

The factor $1/\beta_{mw}$ reverses the scale change, transforming
from the logarithmic scale, m, to the linear scale, w.
The change in phenotype from eqn 6 can be written as

$$\Delta_s\bar{z} = \beta_{zw}V_w/\bar{w}.$$

The regression $\beta_{zw}$ changes scale from fitness, w, to phenotype,
z, and $1/\beta_{zw}$ reverses the direction of the change
in scale. Thus

$$\frac{\Delta_s\bar{z}}{\beta_{zw}} = V_w/\bar{w} = \frac{\Delta_s\bar{m}}{\beta_{mw}}.$$

Because the information accumulated by natural selection
is $\Delta_s\bar{m} = J$, we have

$$\Delta_s\bar{z} = \frac{\beta_{zw}}{\beta_{mw}}J.$$

This expression describes the change in phenotype by selection
in relation to the information gain, J, rescaled
by the transformation from the scale of information, m,
to the scale of phenotype, z. We may describe the scaling
between the gain in information, J, and change in
phenotype caused by selection, $\Delta_s\bar{z}$, as

$$\alpha_z = \frac{\beta_{zw}}{\beta_{mw}}. \tag{10}$$

Thus we can write the relation between the change in
phenotype and the gain in information as

$$\Delta_s\bar{z} = \alpha_z J. \tag{11}$$
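The rescaling from information to phenotype can be checked with a few lines of Python. The sketch below assumes hypothetical frequencies, fitnesses, and phenotypes, and confirms that the change in phenotype caused by selection equals the information gain multiplied by the ratio of the two regression coefficients, $\beta_{zw}/\beta_{mw}$.

```python
import math

# Hypothetical values for three types; not taken from the original analysis.
q = [0.5, 0.3, 0.2]   # initial frequencies, q_i
w = [1.6, 2.2, 2.7]   # fitnesses, w_i
z = [1.0, 2.0, 4.0]   # phenotypes, z_i


def mean(values):
    """Average weighted by the initial frequencies q_i."""
    return sum(qi * v for qi, v in zip(q, values))


def cov(a, b):
    """Covariance weighted by the initial frequencies q_i."""
    a_bar, b_bar = mean(a), mean(b)
    return sum(qi * (ai - a_bar) * (bi - b_bar) for qi, ai, bi in zip(q, a, b))


w_bar = mean(w)
q_new = [qi * wi / w_bar for qi, wi in zip(q, w)]   # eqn 4
m = [math.log(wi) for wi in w]                      # log fitness, m_i

var_w = cov(w, w)
beta_zw = cov(z, w) / var_w   # change of scale from fitness to phenotype
beta_mw = cov(m, w) / var_w   # change of scale from fitness to log fitness

# Information gain and change in phenotype, both caused by selection.
info_gain = sum((qn - qi) * mi for qi, qn, mi in zip(q, q_new, m))
delta_s_z = sum((qn - qi) * zi for qi, qn, zi in zip(q, q_new, z))

scaling = beta_zw / beta_mw   # rescales information into phenotype units
print(delta_s_z, scaling * info_gain)
```

The two printed values agree up to rounding, which shows that the phenotypic scale acts as an alternative scale on which to express the same information gain.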
Box 4. Change in scale 


In the regression model (Box 3) with subscripts used explicitly
for labeling types

$$E(z_i|x_i) = bx_i.$$

If we consider subscripts for two different types, k and i, we
can write $E(z_k|x_k) = bx_k$ and $E(z_i|x_i) = bx_i$. Subtracting
these two equations from each other gives

$$E(z_k - z_i|x_k - x_i) = b(x_k - x_i).$$

Using $\Delta$ to denote a change between the k and i values

$$E(\Delta z|\Delta x) = b(\Delta x),$$

which we can write equivalently as

$$b = \beta_{zx} = \frac{E(\Delta z|\Delta x)}{\Delta x},$$

which we read as: "the regression of z on x is the expected
change in z for a given change in x divided by the change in x."
From this expression, we see that a regression coefficient is the
expected change in scale for one variable in relation to another
variable. One can also think of the regression coefficient as
a sort of generalization of differentiation. For situations in
which we can consider z and x as continuous variables with
an underlying functional relationship, $z(x)$, it will often be the
case that, as the changes become small, $\Delta z \to 0$ and $\Delta x \to 0$
with x confined to a small range of values, then the regression
coefficient approaches the derivative, $\beta_{zx} \to dz/dx$.

Finally, the variables x and $\delta$ are uncorrelated, so that
$\mathrm{Cov}(x, \delta) = 0$. Regression uses all of the available information
in x about z. Thus, any left over deviations, $\delta$, cannot
contain information about z, which is reflected in the lack of
correlation between those variables.

When we have multiple predictors, or causes, $x_j = x_1, x_2, \ldots, x_n$,
then the regression equation is

$$z = \sum_j b_j x_j + \delta,$$

where each $b_j$ is the partial regression of z on $x_j$, holding
constant the other predictor values. Suppose, for example, that
we have two predictors, $x_1$ and $x_2$. For notational convenience,
let $x = x_1$ and $y = x_2$, so that the regression equation
is

$$z = b_x x + b_y y + \delta.$$

If, as above, we take the difference between two x values,
holding y constant, we obtain

$$b_x = \beta_{zx\cdot y} = \frac{E(\Delta z|\Delta x, y)}{\Delta x},$$

which we read as: "the regression of z on x, holding y constant,
is the expected change in z for a given change in x and
a fixed value of y, divided by the change in x." This expression
gives the expected change in scale between z and x for a
given value of y. If z, x, and y are continuous variables with
an underlying functional relationship, $z(x, y)$, then for small
changes confined to a small range of predictor values for x
and y, it will often be the case that the regression approaches
the partial derivative $\beta_{zx\cdot y} \to \partial z/\partial x$.

These properties of regression follow from least squares. The
squared distance between predicted and observed values is the
sum of squares, $\sum \delta_i^2$. Minimizing that distance gives the least
value for the sum of squares, hence the name least squares. All
properties here follow from that minimization. Further aspects of
regression depend on other assumptions. For example, many
tests of statistical significance assume that the residuals have
a normal distribution. Certain interpretations require that
the observations be linearly related to the predictors. I do
not use those further aspects and therefore do not require
any assumptions about linearity or the distribution of
observations and residuals.
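The claim in Box 4 that a regression coefficient approaches the derivative over a small range can be illustrated numerically. The sketch below assumes a hypothetical functional relationship, $z(x) = x^3$, chosen only for illustration, and computes the least squares slope over progressively narrower ranges of x centered on $x = 2$, where the derivative is 12.

```python
def regression_slope(xs, zs):
    """Least squares regression of z on x, with equal weight per point."""
    n = len(xs)
    x_bar = sum(xs) / n
    z_bar = sum(zs) / n
    cov_xz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    return cov_xz / var_x


def phenotype(x):
    """Hypothetical nonlinear relation z(x), assumed for illustration."""
    return x ** 3


x0 = 2.0
derivative_at_x0 = 3 * x0 ** 2   # dz/dx for z = x^3

slopes = []
for half_width in (1.0, 0.1, 0.01):
    # 21 evenly spaced points covering x0 - half_width to x0 + half_width.
    xs = [x0 + half_width * (k - 10) / 10 for k in range(21)]
    zs = [phenotype(x) for x in xs]
    slopes.append(regression_slope(xs, zs))
    print("half width:", half_width, "slope:", slopes[-1])

print("derivative:", derivative_at_x0)
```

As the range of x shrinks, the regression slope moves closer to the derivative. Over a wide range, the slope instead summarizes the average change in scale across that range.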



CAUSES OF PHENOTYPE 

This section partitions the causes of phenotype into 
components. The next section connects the causes of 
phenotype to the capture and transmission of informa- 
tion. The following section partitions fitness into compo- 
nents, dividing the gain in information by selection into 
different causes. Boxes 3-6 provide background on regression.
Box 7 provides citations to the literature.



Overview 

Heritability describes the expected similarity in phenotype
between different individuals [7]. For example,
we may define the predictors of phenotype as the set of 
alleles in an individual, and the heritability as the part 
of similarity between ancestors and descendants ascribed 
to those alleles. Because sex and recombination break 
up particular combinations of alleles, adding up the ef- 
fects of each individual allelic predictor often provides a 
good estimate of the similarity between different relatives 
caused by genetics. 

Alternatively, we may expand the set of predictors to 
include certain nonlinear combinations of alleles. For ex- 
ample, we may have a predictor for the presence of allele 
A, another for the presence of allele B, and a third for the 
presence of both alleles. Certain expanded predictor sets 
may give a more accurate description of similarity be- 
tween closely related ancestor-descendant pairs that are 
likely to share the allelic combinations, but may give a 
less accurate description when the allelic pairs tend to be 
broken up during transmission. 

Here, I am primarily interested in the information that 
a population accumulates by selection, and how different 
processes may reduce or alter the transmission of accu- 
mulated information. My expressions include the classic 
genetic measures as special cases. But I do not empha- 
size the connection to traditional genetics — the genetic 
interpretations are discussed in every basic textbook of 
Box 5. Causes and predictors 



Since path analysis depends on structure, and 
structure in turn depends on the cause-and-effect 
relationship among the variables, we shall first 
say a few words about the way these terms will 
be used . . . There are a number of formal defini- 
tions as to what constitutes a cause and what an 
effect. For instance, one may think that a cause 
must be doing something to lead to something 
else (effect). While this is clearly one type of 
cause-and-effect relationship, we shall not limit 
ourselves to that type only. Nor shall we enter 
into philosophical discussions about the nature of 
cause-and-effect. We shall simply use the words 
"cause" and "effect" as statistical terms similar 
to independent and dependent variables, or [predictor
variables and response variables] [1, p. 3].

I analyze causes of phenotypes and causes of fitness. Here, 
I briefly comment on the word "cause." The above quote 
and the epigraph come from Li's book on Path Analysis. Li's 
point concerns the distinction between three levels of analysis. 
First, true causality describes the relations between actual 
forces and actual effects. Whether such things can ever be 
studied or known directly remains a philosophical problem 
beyond our scope. 

Second, at the other extreme, multiple regression analy- 
sis from classical statistical theory concerns only correlations 
and variances. The standard theory explicitly disavows causal 
interpretation — correlation is not causation. Regression arises 
by minimizing the distance between predicted outcomes and 
actual outcomes — an attempt at optimal prediction. One 
thinks of the variables used to predict outcome simply as pre- 
dictors that, in the past, would have helped one to make a 
better guess about what actually happened. The predictors 
may have direct effects themselves or be correlated with some 
other unseen causal factor. However, those notions of direct 
and unseen cause are irrelevant to the method. 

Third, path analysis takes an intermediate approach. One 
chooses the predictors for a model as a hypothesis about 
cause. Rather than aim for optimal prediction, one aims 
for a set of variables that consistently describe the observed 
patterns of variation. The quality of the causal interpreta- 
tion is primarily evaluated by the consistency of the hypothe- 
sized pathways in capturing the observed variance in outcome. 
Consistency roughly means relative stability in the magnitude 
of a pathway's effect under different circumstances. Although 
that interpretation potentially offers some insight into cause 
and effect, the analytical method remains multiple regression. 
One simply emphasizes the quality of a model as a potential 
causal interpretation rather than as an attempt at optimal 
prediction. 

Consider a model in which we use genes as predictors 
of phenotype. In a breeding program to improve yield, we 
want to predict offspring phenotype in order to make the 
best choice of breeding design. Causality is irrelevant, we 
aim only for a good outcome. By contrast, in a theoretical 
analysis of adaptation by natural selection, we want to 
understand the causal processes. How do the genes that 
affect phenotype combine to determine morphology or 
behavior? How does selection influence the underlying 
genes and the resulting phenotypic design in relation to 
performance? We are after an understanding of the process. 
The quality of prediction will, of course, be the primary 
way to interpret the causal model. But a good prediction 



Box 5 — continued 



arising from the wrong underlying causal model is what we 
most want to avoid. Prediction becomes a method for evalu- 
ation rather than the goal. 

This article analyzes natural selection in relation to causal 
interpretations. For that reason, I think of my models of mul- 
tiple regression as models of path analysis. In a different con- 
text, the same models could be thought of strictly as analyses 
of regression and prediction. 



genetics [7]. Instead, I focus on general equations for 
selection and the transmission of information. In my ex- 
pressions, any predictors can be used including, but not 
limited to, all of the traditional genetic forms. 

Why bother with such abstractions? Because many
extensions to basic genetic theory have been developed
to cope with nongenetic effects or to analyze selection
independently of genetics [5]. The literature tends to deal
with each particular problem as a novel challenge that
requires special theory. For example, maternal effects, kin
selection, cultural evolution and institutional evolution
in economics all have their distinct literatures and ways
of framing problems. Yet all of those problems are just
examples of a general theory of selection and transmission.
In any particular application, the key is to express
the causes of phenotypes (characteristics) and the causes
of fitness (success) by a model, or hypothesis, of how various
predictors combine to determine outcome. A general
theory expressed in terms of any choice of predictors defines
the unifying conceptual framework [4, 5].

Fisher's average effect 

We can separate phenotype into components by

$$z_i = \sum_j b_j x_{ij} + \delta_i.$$

Each type, i, has n different associated $x_i$ values,
$x_{i1}, x_{i2}, \ldots, x_{in}$. From the perspective of multiple
regression, the x's are predictors, or independent variables,
with respect to the phenotype, z. Each $b_j$ is a partial
regression coefficient of z on $x_j$. Roughly speaking, a
partial regression coefficient, $b_j$, describes the average
change in phenotype, z, for a change in the associated
predictor variable, $x_j$.

We often focus on the general relation of a phenotype,
z, to its components, $x_j$, rather than on the particular
phenotype, $z_i$, of a particular type, i, in relation to its
particular components, $x_{ij}$. Thus, we may express the
general relation between a phenotype and its components
as

$$z = \sum_j b_j x_j + \delta,$$

in which one understands that the particular values of
z, $x_j$, and $\delta$ vary for the different types, i, whereas the
average effect of a predictor, $b_j$, is a property of the
population.

The regression expression applies to any predictors, Xj. 
We could use temperature, neighbors' behavior, another 
phenotype, epistatic interactions given as the product of 
allelic values, symbiont characters or an individual's own 
genes. Fisher first presented this regression for pheno- 
type in terms of alleles. Suppose each Xj is the presence 
or absence of an allelic type. Then each bj describes 
the average contribution to phenotype for adding or sub- 
tracting the associated allelic type, and bj is called the 
average effect [DEO [TO]. 

Predicted phenotype is

$$g = \sum_j b_j x_j. \tag{12}$$

In genetic contexts, g is often called the breeding value 
0- Using g, we can partition phenotype into a predicted 
component and a residual component 

$$z = g + \delta, \tag{13}$$

where $\delta = z - g$ is the difference between the actual value
and the predicted value. If we take the average of both
sides, we get $\bar{z} = \bar{g}$, because $\bar{\delta} = 0$.
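The partition of phenotype into a predicted value and a residual can be sketched for a small hypothetical case. The example below assumes four types defined by the presence or absence of two alleles, with invented frequencies and phenotypes that include an interaction between the alleles. As an assumption made here for convenience, the intercept is carried as the coefficient `b0` on a constant predictor, rather than removed by shifting a predictor as in Box 3.

```python
# Hypothetical four-type population; all values are invented for illustration.
q  = [0.4, 0.3, 0.2, 0.1]    # frequencies of the four types
x1 = [0, 1, 0, 1]            # presence (1) or absence (0) of allele A
x2 = [0, 0, 1, 1]            # presence (1) or absence (0) of allele B
z  = [1.0, 1.5, 2.0, 3.5]    # phenotypes; the last type shows an interaction


def mean(values):
    """Average weighted by the type frequencies."""
    return sum(qi * v for qi, v in zip(q, values))


def cov(a, b):
    """Covariance weighted by the type frequencies."""
    a_bar, b_bar = mean(a), mean(b)
    return sum(qi * (ai - a_bar) * (bi - b_bar) for qi, ai, bi in zip(q, a, b))


# Solve the two least squares normal equations for the partial regressions.
c11, c22, c12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
c1z, c2z = cov(x1, z), cov(x2, z)
det = c11 * c22 - c12 ** 2
b1 = (c1z * c22 - c2z * c12) / det   # average effect of allele A
b2 = (c2z * c11 - c1z * c12) / det   # average effect of allele B
b0 = mean(z) - b1 * mean(x1) - b2 * mean(x2)   # coefficient on the constant

# Predicted phenotype g and residual delta for each type (eqn 13).
g = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
delta = [zi - gi for zi, gi in zip(z, g)]

print("average effects:", b1, b2)
print("mean of z and g:", mean(z), mean(g))
print("residuals:", delta)
```

The residuals are nonzero for individual types because the additive predictors cannot capture the interaction, yet they average to zero and are uncorrelated with each predictor, so the averages of z and g agree.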



The components of heritability 

The part of phenotype not transmitted 

Typically, we only follow the transmission of the predictors.
For example, we may follow transmission of
genes plus any other variables we choose. Those effects
that we include explicitly end up as part of the predicted
phenotype, g, and as candidates for the transmitted
phenotype. All effects on phenotype not explicitly included
as predictors end up in the residual, $\delta$. The split between
the predicted phenotype and the residual is arbitrary. If
we add a new predictor, any additional effect of that
predictor moves from the residual, $\delta$, to the predicted
phenotype, g. Usually, we wish to give the best description of
the causes of phenotype that we can. Thus, our choice of
predictors defines our hypothesis about the causes of
phenotype, in the sense of path analysis discussed in Box 5.

The part of phenotype associated with the particular
set of predictors, g, defines one component of heritability.
Aspects of phenotype not associated with the particular
predictors in our model appear as a nontransmitted
component of phenotype, $\delta$, reducing the similarity of
phenotype between ancestors and descendants associated with
the predictors.



Box 6. Nonlinearity 



Regression and path analysis are sometimes thought to be 
limited to linear and additive effects. However, that is misleading.
Consider $z = bx + \delta$. Here, b is the linear relation
between x and z. However, it may be that $x = y^2$, in which
the true underlying cause is y. Thus, we are actually regressing
on a nonlinear function of a causal variable, y. Or, it may
be that we start with $z = b_1x_1 + b_2x_2 + b_3x_3 + \delta$. This appears
to be an additive model. However, the underlying cause may
be $x_1 = y_1$, and $x_2 = y_2$, and $x_3 = y_1y_2$. Thus, our model
expresses nonlinearity and nonadditivity in the causes, y.

In general, any nonlinear relation can be expressed by an 
additive sum of terms, in which the individual terms may be 
nonlinear. Thus, regression can fully account for any nonlin- 
earity by an additive sum of terms. In practice, limitations 
arise because we may not know the correct nonlinear relation, 
and so cannot express the proper sum of nonlinear terms. 
However, that is not a limitation of regression, but rather a 
limitation that arises from our ignorance. Another method 
of analysis does not solve the problem of our ignorance. The 
point is that one must distinguish limitations arising from 
method from limitations arising from ignorance. Confusing 
those different limitations is a common mistake. 
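
A minimal numerical sketch of the first example may help. The Python
code below is illustrative only: the data and the names `mean`, `cov`,
`slope`, `y`, `x`, and `z` are assumptions and are not part of the
original analysis. It regresses z on x = y^2, which is an ordinary
linear regression in x, and obtains an exact fit even though z depends
nonlinearly on the underlying cause, y.

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def slope(z, x):
    """Least-squares regression coefficient of z on x."""
    return cov(z, x) / cov(x, x)

# Hypothetical data: the underlying cause y acts on z as 3*y^2 + 1.
y = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
z = [3.0 * yi ** 2 + 1.0 for yi in y]

# The predictor is a nonlinear function of the cause.
x = [yi ** 2 for yi in y]

b_on_x = slope(z, x)                      # linear in x, nonlinear in y
intercept = mean(z) - b_on_x * mean(x)
residuals = [zi - (intercept + b_on_x * xi) for zi, xi in zip(z, x)]
```

With the correct nonlinear term as the predictor, the residuals vanish.
If the proper transformation of y were unknown, the residual would
absorb the misfit, which is the limitation from ignorance rather than
from method.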



Change in transmitted components of phenotype 

A second component of heritability arises from the stability of the
effects associated with the predictors. If a predictor has effect $bx$
in the original population and effect $b'x'$ in the second population,
then the transmission of that predictor is associated with a change in
phenotype $\Delta(bx) = b'x' - bx$. Box 2 shows that we can express
this change as

$$\Delta(bx) = (\Delta b)\,x + b'(\Delta x).$$

Summing over the j different predictors and using the definition of g
from eqn 12 yields

$$\Delta g = \sum_j (\Delta b_j)\,x_j + \sum_j b'_j(\Delta x_j). \qquad (14)$$

On the right side, the first term describes the change in the
predicted value of a type that arises from the changes in the average
effects of the predictors, $\Delta b_j$, holding constant the predictor
values, $x_j$. For example, the average effect of an allele on
phenotype may be frequency dependent. Thus the average effect will
change over time as the frequency of the allele changes in the
population. The second term describes the change in the transmitted
predictor values, $\Delta x_j$, evaluated in the context of the average
effects from the second population, $b'_j$. For example, an allele may
mutate into another form, thus weighting the average effect by a
different amount.
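
The two-term split of the change in predicted phenotype can be checked
numerically. The following Python sketch uses made-up values for three
predictors; the names `b`, `x`, `b_next`, and `x_next` are hypothetical
stand-ins for the effects and predictor values before and after
transmission.

```python
# Hypothetical effects (b) and predictor values (x) for three predictors,
# in the original population and after transmission (b', x').
b      = [0.5, 1.2, -0.3]
x      = [1.0, 0.0, 2.0]
b_next = [0.6, 1.0, -0.3]
x_next = [1.0, 1.0, 1.5]

# Predicted phenotype before and after: g = sum_j b_j x_j.
g      = sum(bj * xj for bj, xj in zip(b, x))
g_next = sum(bj * xj for bj, xj in zip(b_next, x_next))

# First term: change in effects, holding predictor values constant.
effect_term = sum((b2 - b1) * xj for b1, b2, xj in zip(b, b_next, x))

# Second term: change in predictor values, weighted by the new effects b'.
predictor_term = sum(b2 * (x2 - x1) for b2, x1, x2 in zip(b_next, x, x_next))

delta_g = g_next - g
```

The sum `effect_term + predictor_term` reproduces `delta_g` exactly, up
to floating-point rounding, for any choice of values. The split is an
identity rather than an approximation.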

The smaller the $\Delta b$ and $\Delta x$ values, the less the
phenotype changes with respect to the transmitted predictors, and the
higher the heritability associated with those predictors.
Equivalently, the more stable the predictors and their average
effects, the greater the fidelity at which those particular predictors
transmit the information accumulated by selection to the new
population.

The change in the predictors, $\Delta x$, includes mutation
as well as any other process that alters predictor values 
[H [51 [TT1 [T2]. For example, predictors in a descendant 
may derive from multiple ancestors. We can think of the 
mixing of predictors by considering the change in predic- 
tor values when derived from different sources. In some 
cases, we may wish to alter the assignment of descen- 
dants to ancestors. For example, a behavior may influ- 
ence the frequency of nondescendant types. To associate 
the behavioral phenotype with the change in frequency, 
we could assign those nondescendants to the ancestral 
behavior responsible for their presence |13j . In general, 
we can make such assignments in any way that we choose. 
The key is that assigning different descendants to an an- 
cestor may alter the change in predictor values between a 
descendant and its assigned ancestor. Such changes may 
alter the fidelity at which information is transmitted [5j. 
I will take up that topic in the next article [3]. 

The part transmitted and the change during transmission 

The full, exact expression from eqn 3 for the total evolutionary
change is

$$\Delta\bar{z} = \sum_i \Delta q_i z_i + \sum_i q'_i \Delta z_i.$$

We can partition phenotype as $z = g + \delta$, the split between the
part explained by the predictors of phenotype, g, and the part that is
not explained by the set of predictors in our model for phenotype,
$\delta$. From eqn 13, $\Delta\bar{z} = \Delta\bar{g}$ because
$\bar{\delta} = 0$, thus

$$\Delta\bar{z} = \Delta\bar{g} = \sum_i \Delta q_i g_i + \sum_i q'_i \Delta g_i.$$

With $g_i = z_i - \delta_i$, we get

$$\Delta\bar{z} = \sum_i \Delta q_i z_i - \sum_i \Delta q_i \delta_i + \sum_i q'_i \Delta g_i. \qquad (15)$$

We can express each of these terms with a particular notation that
emphasizes its interpretation

$$\Delta\bar{z} = \Delta_s\bar{z} - \Delta_n\bar{z} + \Delta_t\bar{z}. \qquad (16)$$

On the right side, the terms are the change caused by selection, the
change caused by the part of phenotype that is not associated with a
transmitted predictor, and the change in the effects of the predictors
during transmission.
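
The three-part partition can be illustrated with a small numerical
sketch in Python. All values are made up, and the variable names
(`q_next`, `g_next`, and so on) are hypothetical labels for the primed
quantities. The residuals are chosen to have mean zero in the
ancestral population, and the check is made against the change in the
mean predicted phenotype, which equals the change in mean phenotype
when the mean residual is zero in both populations.

```python
# Hypothetical population of three types (all numbers are made up).
q      = [0.5, 0.3, 0.2]     # ancestral frequencies, q_i
q_next = [0.4, 0.3, 0.3]     # frequencies after selection, q'_i
g      = [1.0, 2.0, 3.0]     # predicted phenotype, g_i
delta  = [0.2, -0.2, -0.2]   # residual, delta_i (frequency-weighted mean is zero)
g_next = [1.1, 2.0, 2.9]     # predicted phenotype among descendants of type i

z  = [gi + di for gi, di in zip(g, delta)]      # z_i = g_i + delta_i
dq = [q2 - q1 for q1, q2 in zip(q, q_next)]     # Delta q_i
dg = [g2 - g1 for g1, g2 in zip(g, g_next)]     # Delta g_i

selection      = sum(dqi * zi for dqi, zi in zip(dq, z))       # Delta_s term
nontransmitted = sum(dqi * di for dqi, di in zip(dq, delta))   # Delta_n term
transmission   = sum(q2 * dgi for q2, dgi in zip(q_next, dg))  # Delta_t term

total = selection - nontransmitted + transmission              # eqn 16

# Direct calculation of the change in mean predicted phenotype.
g_bar      = sum(qi * gi for qi, gi in zip(q, g))
g_bar_next = sum(q2 * g2 for q2, g2 in zip(q_next, g_next))
mean_residual = sum(qi * di for qi, di in zip(q, delta))
```

Here `total` matches `g_bar_next - g_bar`, so the three terms account
for the full change without remainder.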



HERITABILITY AND INFORMATION

This section focuses on the amount of information that populations
accumulate by selection, and the various processes that degrade or
alter the transmission of that information. Some of the forms given
here include the classic genetic measures of heritability as special
cases. However, I do not emphasize those connections. Rather, I focus
on general expressions given in terms of the full Price equation for
total evolutionary change and based on predictors that may be chosen
in any way. Different problems and goals will lead one to choose
different sets of predictors or underlying causal schemes for
phenotypes. The results here apply to any choice of predictors and
causal scheme.



Box 7. Brief history of evolutionary partitions



Fisher [51 [H] partitioned phenotype into its various genetic causes.
Quantitative genetics extended the partitioning of phenotype by
genetic and nongenetic causes [3[S]. Models of cultural evolution use
culturally transmissible attributes as predictors of phenotype
15-417].

Quantitative genetic models may also consider partitions of fitness
into component causes. Recent work on partitions of fitness was
stimulated by Lande and Arnold [18]. Many subsequent studies expanded
that approach, including various explicit descriptions based on path
analysis [19-23]. I unified the different lines of study on partitions
of phenotype and partitions of fitness [HE], motivated initially by
Queller's quantitative genetic models of kin selection [24, 25].

In the text, I mentioned that $rB - C > 0$ can sometimes be
interpreted in terms of group selection. For example, if neighbors'
phenotype, y, is an average character value in a local group, then r
can be defined as the regression of individual character value on
group character value. That group regression can be considered in a
path analysis model, which is roughly the way in which Heisler and
Damuth [T5] analyzed group selection. In their article, they
emphasized "contextual analysis" similarly to the way in which I have
emphasized "path analysis". Frank [5B] and Taylor and Frank [27] also
calculated r by regressing group value on individual value in several
models, following a long tradition that blurred the mathematical
distinction between kin and group selection [SUES].

Some of the multivariate analyses of fitness attempt to predict
evolutionary dynamics, and therefore must make explicit assumptions
about the distribution of phenotypes and the nature of heritability. I
do not discuss dynamics; my models do not require any of those extra
assumptions.

We start with eqn 15, the partition of phenotypic change into
components

$$\Delta\bar{z} = \sum_i \Delta q_i z_i - \sum_i \Delta q_i \delta_i + \sum_i q'_i \Delta g_i.$$

The first term on the right side is the selection component,
$\Delta_s\bar{z}$. From eqn 11, $\Delta_s\bar{z} = a_z J$, where $a_z$
changes scale between phenotype, z, and the gain in information by
selection, J. Thus,

$$\Delta\bar{z} = a_z J - \sum_i \Delta q_i \delta_i + \sum_i q'_i \Delta g_i.$$

Here, selection happens in the initial (parental) population, causing
a gain in information, J. On the phenotypic scale, that gain in
information is $a_z J$. The remaining terms include processes that
cause loss of information during transmission or cause other changes
to phenotype.



The part of phenotype not transmitted 

Start by assuming that the predictors and their effects do not change
during transmission, $\Delta g_i = 0$. That assumption reduces total
change to

$$\Delta_h\bar{z} = a_z J - \sum_i \Delta q_i \delta_i,$$

where $\Delta_h = \Delta_s - \Delta_n$ denotes the heritable component
of selection, which is the total selection, $\Delta_s$, minus the part
of selective change that is not associated with predictors,
$\Delta_n$. The part not associated with predictors is not explicitly
transmitted within the given model of phenotype.

The second term, $\sum_i \Delta q_i \delta_i$, has the general form
(eqn 11) of the change in information

$$\sum_i \Delta q_i z_i = a_z J,$$

which holds for any choice of z. Thus, letting $z = \delta$, we obtain
$\sum_i \Delta q_i \delta_i = a_\delta J$. Putting this into the
original expression yields

$$\Delta_h\bar{z} = a_z J - a_\delta J = (a_z - a_\delta) J.$$

The scale change terms, a, have the important additivity property
that, in general, $a_a + a_b = a_{a+b}$. Thus,

$$a_z - a_\delta = a_{z-\delta} = a_g,$$

because $g = z - \delta$. The expression for the change in phenotype,
ignoring the change during transmission in the predictors and their
effects, is

$$\Delta_h\bar{z} = a_g J. \qquad (17)$$

This expression is the information gain by selection, J, scaled by
$a_g$, which relates the predicted phenotype, g, to the information
accumulated by selection. Because $g = z - \delta$, we see that the
amount of information transmitted is degraded by $\delta$, the
fraction of the phenotype, z, that is not explained by the predictors.
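
The additivity property follows directly when the scaling term is
written as a ratio of regression coefficients on fitness, with the
same denominator for every character (the ratio form of eqn 10 that is
also used in Box 8). The short derivation below is a sketch of that
step; it assumes only that each regression coefficient on w is a
covariance divided by the variance of w, and that covariance is linear
in its arguments.

```latex
\begin{align*}
a_z - a_\delta
  &= \frac{\beta_{zw}}{\beta_{mw}} - \frac{\beta_{\delta w}}{\beta_{mw}}
   = \frac{\mathrm{Cov}(z, w) - \mathrm{Cov}(\delta, w)}{\mathrm{Cov}(m, w)} \\
  &= \frac{\mathrm{Cov}(z - \delta, w)}{\mathrm{Cov}(m, w)}
   = \frac{\beta_{gw}}{\beta_{mw}}
   = a_g .
\end{align*}
```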



Change in transmitted components of phenotype

When we add back the remaining term to eqn 17, we obtain the full
expression for phenotypic change as

$$\Delta\bar{z} = a_g J + \sum_i q'_i \Delta g_i.$$

The last term is the change in the transmitted components of
phenotype. From eqn 14, those components include changes in the
predictors and changes in the effects of the predictors. A predictor's
effect is its associated multiple regression coefficient. Multiple
regression coefficients often change with context. On the one hand,
the true underlying causal effect may change. On the other hand, our
model of causality may not be exactly right, in which case shifting
context will cause the assigned role of different predictors to
change, even though the underlying causal effects of those predictors
may not have changed.

Various approaches may be taken to evaluate the accuracy of the causal
model, such as the stability of the predictor effects under changing
context [T]. Typically, a better causal model has predictors with
greater stability, shifting the components of total change more
strongly to the $a_g J$ information term. That increase in the
information term is usually advantageous with respect to
interpretation, because it is often hard to evaluate the meaning of
changes in predictors and their effects in the second term.

Suppose, for example, that a significant component of phenotype is not
explained by a stable set of predictors. Is the information
accumulated by selection in the initial population lost during
transmission because it is not associated with any transmissible
component? Or, is that information transmitted by other predictors
that are not included in our model? If the information does transmit
by predictors not in our model, that information contributes to the
second term with changing values of the predictors and their effects.
Such changes are hard to interpret, because many different processes
can potentially alter the predictors and their effects.

These fundamental equations of selection and evolution are, in a way,
rather arbitrary, because they depend so strongly on the particular
set of predictors that one chooses. What can we conclude? First, the
equations are always true, and so give us a clear sense of the
essential nature of selection, information and evolution. Second, a
key part of understanding any problem concerns choosing the right set
of predictors. Third, simple genetic models provide a good starting
point in many cases, but rarely define a complete set of predictors
and an accurate expression of causality. If one is able to model the
causal scheme well, the analysis will often be simple and natural. I
have emphasized a path analysis interpretation for the regression
expressions, because path analysis emphasizes the choice of a good
causal model.


Fisher's fundamental theorem 

If we hold the predictors and their effects constant, then using
eqn 17, the change in mean log fitness is

$$\Delta_h\bar{m} = a_g J$$

for $m = g + \delta$. This expression for change in fitness, holding
constant the predictors and their average effects, provides a
generalization of Fisher's fundamental theorem of natural selection.
Fisher used the presence or absence of allelic types as predictors,
and the associated value of predicted fitness, g, as the genic value
of fitness. With those definitions, the expression here is equivalent
to Fisher's theorem. To translate back to the particular notation that
Fisher used, one would translate the definitions for $a_g$ and J into
Fisher's forms. Frank [4] provides the tools for the translation,
following Price [3"U] and Ewens [31 . The point here is that Fisher's
theorem
holds for any choice of predictors, as emphasized in Frank 



CAUSES OF FITNESS 

The expression $\Delta_s\bar{z} = a_z J$ associates the accumulation
of information by selection, J, with the selective component of
phenotypic change. But that expression does not tell us why the
association occurs. The phenotype may directly influence fitness.
Alternatively, the phenotype may have no direct effect on fitness, but
instead may be associated with some other process that influences
fitness. A significant part of evolutionary analysis concerns
evaluating the causes of fitness (Box 7).

We may analyze the causes of fitness in the same way that we analyzed
the causes of phenotype. We write our model, or hypothesis, for the
causes of fitness as the regression equation

$$w = \phi + \pi z + \sum_k \alpha_k y_k + \epsilon. \qquad (18)$$

Here, $\phi$ is the baseline fitness when all other terms are zero;
$\pi$ is the average direct effect of the phenotype z on fitness,
holding constant the other predictors of fitness; and $\alpha_k$ is
the average effect of the other predictors of fitness, $y_k$. We may
use any number of other predictors, and those predictors may be
defined in any way, including factors in the model for phenotype. For
example, predictors can be alleles, nonlinear interactions between
combinations of alleles, symbionts, maternal effects, cultural or
environmental attributes, other phenotypes, phenotypes of neighbors,
and so on. The residual, $\epsilon$, is the difference between the
predicted value of fitness for a given set of predictors and the
actual fitness.



A simple example

To study the role of different predictors of fitness, it is useful to
reduce the model to just the direct effect, z, and one indirect
effect, y, yielding

$$w = \phi + \pi z + \alpha y + \epsilon.$$

In this partial regression equation, it is helpful to write out the
regression coefficients in full notation to emphasize their
interpretation. The partial regression coefficient
$\pi = \beta_{wz \cdot y}$ is the average effect of z on w holding y
constant, and $\alpha = \beta_{wy \cdot z}$ is the average effect of y
on w holding z constant, thus

$$w = \phi + \beta_{wz \cdot y}\,z + \beta_{wy \cdot z}\,y + \epsilon. \qquad (19)$$



Condition for the increase of a phenotype by selection

Using the standard covariance form for selection based on eqn 6, the
partial change in z caused by selection is

$$\bar{w}\Delta_s\bar{z} = \mathrm{Cov}(w, z),$$

which simply states that z increases by selection when it is
positively associated with fitness. However, we now have the
complication shown in eqn 19 that fitness also depends on another
predictor, y. If we expand the covariance using the full expression
for fitness in eqn 19 we obtain

$$\bar{w}\Delta_s\bar{z} = \beta_{wz \cdot y} V_z + \beta_{wy \cdot z}\,\mathrm{Cov}(y, z).$$

If we replace the covariance term by the product of a regression
coefficient and a variance, $\beta_{yz} V_z$, we have

$$\Delta_s\bar{z} = (\beta_{wz \cdot y} + \beta_{wy \cdot z}\beta_{yz})\,V_z/\bar{w}. \qquad (20)$$

The condition for the increase of z by selection is
$\Delta_s\bar{z} > 0$. The same condition using the terms on the right
side is

$$\beta_{yz}\beta_{wy \cdot z} + \beta_{wz \cdot y} > 0. \qquad (21)$$

Let us use an abbreviated notation for the three terms

$$\beta_{yz} = r, \qquad \beta_{wy \cdot z} = B, \qquad \beta_{wz \cdot y} = -C.$$

The first term, $\beta_{yz} = r$, describes the association between
the phenotype, z, and the other predictor, y. An increase in z by the
amount $\Delta z$ corresponds to an average increase of y by the
amount (see Box 4)

$$\Delta y = r\Delta z.$$

The second term, $\beta_{wy \cdot z} = B$, describes the direct effect
of the other predictor, y, on fitness, holding constant the focal
phenotype, z. The third term, $\beta_{wz \cdot y} = -C$, describes the
direct effect of the phenotype, z, on fitness, w, holding constant the
effect of the other predictor, y.

Using the abbreviated notation, the condition for the increase in z by
selection is

$$rB - C > 0.$$
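
The equivalence between the covariance form and the partitioned form
can be checked on a small data set. The Python sketch below is
illustrative only: the six individuals and their values are made up,
and the partial regression coefficients are obtained from the standard
two-predictor least-squares formulas rather than from any method
described in the text.

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# Hypothetical individuals: focal phenotype z, secondary predictor y, fitness w.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.5, 0.0, 2.5, 2.0, 4.5, 4.0]
w = [1.1, 0.9, 1.3, 1.0, 1.6, 1.4]

Vz, Vy, Cyz = cov(z, z), cov(y, y), cov(y, z)
Cwz, Cwy = cov(w, z), cov(w, y)

# Partial regression coefficients from the two-predictor normal equations.
det = Vz * Vy - Cyz ** 2
beta_wz_y = (Cwz * Vy - Cwy * Cyz) / det   # direct effect of z on w
beta_wy_z = (Cwy * Vz - Cwz * Cyz) / det   # direct effect of y on w
beta_yz = Cyz / Vz                         # regression of y on z

# Abbreviated notation.
r, B, C = beta_yz, beta_wy_z, -beta_wz_y

partitioned = (r * B - C) * Vz / mean(w)   # partitioned form, eqn 20
covariance_form = Cwz / mean(w)            # covariance form of selection
```

In this made-up example the direct effect of z on fitness is negative
(C is positive), yet `partitioned` and `covariance_form` agree and are
both positive, so z increases by selection because rB exceeds C.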

The following sections interpret this condition in terms of three
different biological scenarios.

FIG. 1. Path diagrams for the effects of phenotype, z, and secondary
predictor, y, on fitness, w. (a) An unknown cause associates y and z.
The arrow connecting those factors points both ways, indicating no
particular directionality in the hypothesized causal scheme. (b) The
phenotype, z, directly affects the other predictor, y, which in turn
affects fitness. The arrow pointing from z to y indicates the
hypothesized direction of causality.



Interactions between two species 

I trace the effects of phenotype z in species A and 
phenotype y in species B on the fitness of types from 
species A [3"%ll34| . One may think of species B as an 
ecological partner that can influence the fitness of types 
from species A. Here, fitness always refers to effects on 
species A. 



Unknown cause of association 

I follow the path diagram in Fig. 1a. Increases in the phenotype, z,
by an amount $\Delta z$, reduce fitness by $-C\Delta z$. Increases in
the phenotype y directly benefit fitness by $B\Delta y$. The z and y
phenotypes are associated by r, although no specific cause is known.
It may be that similar phenotypes tend to settle in the same area, or
that a common environment of temperature and nutrients causes a
phenotypic association. In any case, as z increases, the associated
value of y changes on average by $\Delta y = r\Delta z$ and,
equivalently, $B\Delta y = rB\Delta z$.

Tracing the pathways in Fig. 1a, an increase in the direct phenotype
by $\Delta z$ causes a change in fitness by $(rB - C)\Delta z$, which
is greater than zero when $rB - C > 0$. Thus, selection may favor an
increase in z even though z directly decreases fitness, because the
benefit from species B's phenotype, y, in proportion to rB, may
outweigh the direct cost, $-C$.



Direct cause of association 

Alternatively, suppose that the phenotype z directly enhances the
vigor of its partners from species B. That direct effect of z on
species B causes an increase in the benefit, y, that species B
provides back to those with phenotype z. Fig. 1b shows this direct
cause of y by z. The condition for z to be positively associated with
fitness and to increase by selection remains $rB - C > 0$. However,
the interpretation differs. In this case, z directly influences its
neighbors' phenotype, y, rather than being associated with y by some
unknown cause.



Body temperature 

Suppose z is body temperature, which imposes a direct effect $-Cz$ on
fitness. That direct cost may arise because body temperature raises
the rate at which energy is used. Let y be speed of response to a
challenge, such as a predator attack. Faster response provides a
direct benefit, By. An unknown cause may associate temperature, z, and
response rate, y, by an amount r (Fig. 1a). For example, sunshine may
directly raise temperature and simultaneously increase response to
attack by providing better visual opportunities. Alternatively,
temperature, z, may directly raise response rate, y, by increasing the
responsiveness of muscles (Fig. 1b). In either case, selection favors
an increase in body temperature if $rB - C > 0$.



Social evolution and group selection 

The phenotype z may be a costly altruistic behavior that helps
neighboring individuals [5] [T31 [Ml. The direct effect on fitness is
$-Cz$. Neighbors have phenotype y that provides a benefit, By, back to
the original individual. An association, r, between z and y may arise
in a variety of ways.

Some unknown cause may associate z and y (Fig. 1a). For example,
shared cultural, environmental or genetic variation may cause related
behavior. Or a shared symbiont may cause an association. In general,
any association in the predictors of phenotype will cause an
association of phenotypic values.

In other cases, the altruistic phenotype, z, may directly enhance
neighbors' beneficial behavior, y, in proportion to r (Fig. 1b). For
example, the level of y in the neighbors may depend on the probability
of the neighbors' survival. If an increase in z raises neighbors'
survival in proportion to r, that increase in survival enhances the
expression of the neighbors' behavior, y, which has a beneficial
effect on fitness of By.

Whether r arises from unknown causes (Fig. 1a) or from the direct
effect of z on y (Fig. 1b), we can trace the effect of an increase in
z on fitness. The condition for an increase in z to raise fitness is
$rB - C > 0$.

In some cases, we may interpret the condition $rB - C > 0$ in terms of
group selection [28]. For example, z may measure individual restraint
in the harvesting of nonrenewable resources [26]. Greater restraint
reduces the direct benefit to the individual, because it means less
resource harvested, with an effect on fitness of $-Cz$. Neighbors'
phenotype, y, may be the average restraint among individuals in a
local group with regard to harvesting nonrenewable resources.

Greater group restraint provides a benefit to all members of the
group, including our focal individual, by providing greater local
productivity through maintenance of nonrenewable resources. The
benefit of group restraint on individual fitness is By. The
association between an individual's phenotype, z, and the group
phenotype, y, is r. Thus, when $rB - C > 0$, individual restraint
evolves and provides a joint benefit to all group members. Here, the
two predictors of fitness are individual behavior, z, and average
group behavior, y. This type of group selection is just a special case
of partitioning the causes of fitness, in which one of the predictors
is a group attribute (Box 7).



CAUSAL STRUCTURE 

All of these examples share a common causal structure. We are
interested in the change in a phenotype, z, caused by selection.
Fitness depends on two predictors: the phenotype of interest, z, and
another predictor, y. In all cases, the condition for the increase in
z by selection is $rB - C > 0$. This condition is just the partition
of the causes of fitness into two components. The direct effect on
fitness of z is $-C$, and the direct effect of y is B. We multiply y
by r to change the scale of the effect from y to z, because the net
effect must be the relation between z and fitness, w.

We can see the logical relations and the units for the various scales
by writing out the full notation

$$rB - C = \beta_{yz}\beta_{wy \cdot z} + \beta_{wz \cdot y}. \qquad (22)$$

Box 3 shows that a regression coefficient, $\beta_{xy}$, has units
$\Delta x/\Delta y$. Taking the terms of the above equation in order
from left to right, the units are

$$\frac{\Delta y}{\Delta z}\,\frac{\Delta w}{\Delta y} + \frac{\Delta w}{\Delta z} = \frac{\Delta w}{\Delta z}. \qquad (23)$$

The ratio $\Delta w/\Delta z$ is the change in fitness, w, per unit
change in the phenotype, z. That ratio is the slope of fitness on
phenotype. When the slope is positive, selection favors the increase
of the phenotype. In any analysis of this sort, the term

$$r = \beta_{yz} = \frac{\Delta y}{\Delta z} \qquad (24)$$

rescales changes of the secondary predictor, $\Delta y$, with respect
to changes in the primary scale, $\Delta z$.

The key point is that $rB - C > 0$ simply partitions fitness into the
direct effect of a phenotype plus the indirect effect through a
secondary predictor. The true causal structure will, of course,
frequently depend on multiple secondary causes, as in the general
regression model for the causes of fitness. Multiple causes lead to an
expanded expression for the increase of z caused by selection,
$\Delta_s\bar{z} > 0$, as

$$\sum_i r_i B_i - C > 0,$$

in which each $r_i$ is the regression of $y_i$ on z, and each $B_i$ is
the partial regression of w on $y_i$ holding constant the other
factors. One may also need to consider cascading causes or hidden
factors in the sense of path analysis [T]. The simple expression
$rB - C > 0$ should be thought of as a convenient example to
illustrate the logic of partitioning the causes of fitness, or as the
expression of simplified models that isolate two opposing processes.

In this section, I have analyzed the partitioning of fitness. I have
not discussed the partition of phenotype into components,
$z = g + \delta$, where g is the sum of the predictors of phenotype.
The amount of information accumulated by selection that can be
transmitted depends on the slope of fitness, w, relative to the
transmissible predictors of phenotype, g. If we think of g in terms of
the genetic predictors of phenotype, then r can be interpreted as a
genetic relatedness coefficient, and $rB - C > 0$ calls to mind
Hamilton's rule from the theory of kin selection [13]. The next
article takes up the relations between kin selection and the general
analysis of the causes of fitness and the causes of phenotype [3]. A
full evolutionary analysis also requires attention to other causes of
change, $\Delta_t\bar{z}$, in eqn 16 [H[S].



It is important to relate the causes of fitness to information, which
is the ultimate scale for selection. Box 8 connects the partitions of
fitness in this section to the expressions of information given
earlier in this article.



Box 8. Information and the causes of fitness



Changes caused by selection can always be related to the change in
information accumulated by the population. For example, the change in
phenotype caused by selection from eqn 11 is

$$\Delta_s\bar{z} = a_z J,$$

where J is the change in information by selection, and $a_z$ relates
the scale of information to the scale of phenotype. We can examine the
units of the scaling term

$$a_z = \frac{\beta_{zw}}{\beta_{mw}},$$

which is the ratio of two regression coefficients (eqn 10). A
regression coefficient, $\beta_{zy}$, has units $\Delta z/\Delta y$,
when used as a scaling relation for changes in average values (Box 3).
Thus, the units for the scaling relation, $a_z$, are

$$\frac{\beta_{zw}}{\beta_{mw}} = \frac{\Delta z}{\Delta w}\,\frac{\Delta w}{\Delta m} = \frac{\Delta z}{\Delta m}.$$

The term $\Delta m$ has units of change in log fitness. Changes in log
fitness are equivalent to changes in information, J (eqn 8). To
emphasize that J is a change in information, write the units on J as
$\Delta I$. Thus, the scaling factor

$$a_z = \frac{\Delta z}{\Delta I}$$

is the change in phenotype relative to the change in information.

One must learn to read the regression coefficients as scaling factors
that change units. Once one learns to recognize the scale changes, and
the key units such as information and phenotype, the fundamental
equations can be read like a sentence. When analyzing selection, I
prefer information as the ultimate scale, because selection is the
process by which populations accumulate information.

With that background, I present a long sentence to translate the
causes of fitness into an expression for the change in information.
Start with eqn 20 and divide both sides by $a_z$, yielding

$$\Delta_s\bar{m} = \frac{\Delta_s\bar{z}}{a_z} = \beta_{mw}\left(\frac{\beta_{wz \cdot y} + \beta_{wy \cdot z}\beta_{yz}}{\beta_{zw}}\right)\frac{V_z}{\bar{w}}.$$

The units are

$$\frac{\Delta_s\bar{z}}{a_z} = \Delta z\left(\frac{\Delta I}{\Delta z}\right) = \Delta I,$$

the change in information by selection. All of the regression
coefficients in the prior equation change scales for the various
terms, and we also have $V_z/\bar{w}$, which has units
$\Delta z^2/\Delta w$. The net units of the long right side are
$\Delta I$, the change in information. The right side appears complex.
But each term has a simple, readable meaning with respect to the
effect of a predictor on fitness, and the scale changes required to
transform those effects into the common units of information. To
understand selection, we often need to decompose fitness and
phenotypes into their component causes. Such decomposition requires
that we combine all the components properly to recover the correct
scale of analysis.
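
As a numerical sketch of the scaling relation, the Python code below
checks that the selective change in phenotype equals the information
gain multiplied by the ratio of regression coefficients. The
population is made up, and two assumptions are built in: m is taken as
the natural log of fitness, and the frequency change by selection is
taken as q_i(w_i / mean w - 1). The names `wmean` and `wcov` are
hypothetical helpers for frequency-weighted means and covariances.

```python
import math

# Hypothetical population of three types (all numbers are made up).
q = [0.5, 0.3, 0.2]              # frequencies, q_i
w = [0.8, 1.0, 1.5]              # fitness, w_i
z = [1.2, 1.8, 2.8]              # phenotype, z_i
m = [math.log(wi) for wi in w]   # assumption: m is log fitness

def wmean(v):
    """Frequency-weighted mean."""
    return sum(qi * vi for qi, vi in zip(q, v))

def wcov(a, b):
    """Frequency-weighted covariance."""
    ma, mb = wmean(a), wmean(b)
    return sum(qi * (ai - ma) * (bi - mb) for qi, ai, bi in zip(q, a, b))

w_bar = wmean(w)
dq = [qi * (wi / w_bar - 1.0) for qi, wi in zip(q, w)]   # Delta q_i by selection

J = sum(dqi * mi for dqi, mi in zip(dq, m))       # information gain, in units of m
ds_z = sum(dqi * zi for dqi, zi in zip(dq, z))    # selective change in mean z

# Ratio of regression coefficients beta_zw / beta_mw; the variance of w cancels.
a_z = wcov(z, w) / wcov(m, w)
```

The product `a_z * J` reproduces `ds_z`, so the regression ratio acts
purely as a change of units from information to phenotype.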



DISCUSSION 

I first partitioned phenotype with respect to a set of 
hypothesized causes. I then partitioned fitness with re- 
spect to a different set of hypothesized causes. Finally, 
I placed those partitions of phenotype and fitness into a 
general expression for selection and evolutionary change. 
Those steps allowed me to express heritability, selection 
and evolutionary change in terms of causal components. 

I also translated the standard expressions of selection 
and evolution, given in terms of regressions, covariances 
and variances, into expressions for the change in infor- 
mation. In my view, selection is best interpreted as the 
accumulation of information by populations [BJ. Other 
evolutionary processes often cause a decay in the trans- 
mission of information. The information expressions al- 
low one to read the equations of selection and evolution 
as if they were sentences. Those sentences express the 
fundamental relations between the causes of phenotypes 
and fitness and the consequences for the change in infor- 
mation by evolutionary processes. 

I showed that the commonly used regression coefficients in models of
selection and evolution can be understood as coefficients for the
change in scale with respect to the ultimate scale of information. For
example, the change in a phenotype caused by selection can be
understood as a rescaling of the change in information accumulated by
selection. Certain measures of heritability, often expressed as
regression coefficients, are the change in the scaling of information
from one phenotype to another. For example, a parent-offspring
regression may describe the change in scale between parent and
offspring phenotype with respect to the underlying information content
in those phenotypes.

My extended development in terms of causal compo- 
nents and information may, at first, seem like a lot of 
technical complication. We are, after all, simply model- 
ing selection, heritability and other widely studied evolu- 
tionary processes. Many models of those processes seem 
more direct and concise. My goal is to go beyond com- 
mon calculations or common applications. The more ab- 
stract and exact models here provide a conceptual guide 
for understanding how selection actually works, how pop- 
ulations accumulate information, and how that informa- 
tion is transmitted or lost. 

I have also traded the certainty of the standard models 
of genetics for the uncertainty that arises when we freely 
choose our predictors as causal hypotheses. In my view, 
the apparent certainty of genetics is often misleading. We 
know that many factors influence phenotypes in addition 
to the narrowly defined allelic types of genes. Tradition- 
ally, a specific extended model deals with each additional 
factor: cytoplasmic inheritance, nonlinear genetic inter- 
actions, maternal effects, social interactions, and so on. 
By describing each of those aspects as a special situation, 
one ends up with a catalog of special models. 

The models here show how to think in general about
a variety of causal structures. Those models are only as
good as the particular hypothesized system of causality
that we choose. But that is also true for genetic models 
and for every other model, whether or not we admit it 
openly. Here, I have traded the false sense that there 
are a few standard models for the more realistic view 
that one has to bring a good hypothesis to an analysis 
in order to get a good understanding of phenotypes and 
selection. 

Hamilton [13] made clear the central role of causal
analysis in kin selection theory:

Considerations of genetical kinship can give a 
statistical reassociation of the [fitness] effects 
with the individuals that cause them. 

The seemingly endless debates about kin selection arise from failure
to recognize that the theory is ultimately a way of framing causal
hypotheses [H [5]. The following article develops kin selection as a
method of causal modeling.